The Application of Statistical Relational Learning to a Database of Criminal and Terrorist Activity

Authors

  • Brian Delaney
  • Andrew S. Fast
  • William M. Campbell
  • Clifford J. Weinstein
  • David D. Jensen
Abstract

We apply statistical relational learning to a database of criminal and terrorist activity to predict attributes and event outcomes. The database stems from a collection of news articles and court records that are carefully annotated with a variety of variables, including categorical and continuous fields. Manual analysis of this data can help inform decision makers seeking to curb violent activity within a region. We use this data to build relational models from historical data to predict attributes of groups, individuals, or events. Our first example involves predicting social network roles within a group under a variety of data conditions. Collective classification can be used to boost accuracy under data-poor conditions. Additionally, we were able to predict the outcome of hostage negotiations using models trained on previous kidnapping events. The overall framework and techniques described here are flexible enough to be used to predict a variety of variables. Such predictions could be used as input to a more complex system to recognize the intent of terrorist groups or as input to inform human decision makers.

1 Background and Motivation

During the last decade, there has been an increasing effort toward data collection on criminal and terror networks using open source materials (e.g., news articles, police reports, and court documents). A straightforward use of such data includes manual analysis of groups and individuals involved in nefarious activity to inform key decision makers tasked with preventing future bombings or other violent attacks. However, if the collection is detailed with specific annotations including continuous variables and categorical fields, the application of statistical machine learning becomes possible. An example of such an analysis is shown in [1], where the author used statistical methods to identify extremist groups responsible for surprise terror attacks. By modeling past behavior, statistical techniques can help find large-scale patterns in the data and possibly be used to prevent or inform future activities.

This paper investigates the use of statistical machine learning to predict individual attributes and event outcomes from a graphical representation of a relational database of terrorist activity. We apply statistical relational learning algorithms to predict leadership roles of individuals in a group based on patterns of activity, communication, and individual attributes. Using labeled training data, we apply supervised learning to build a model that describes the structures and patterns of leadership roles. The relational model returns a probability that a particular person is in a leadership role given a graphical representation of the individual's activities and attributes. A held-out test set is used for evaluation, and receiver operating characteristic (ROC) curves for correct prediction of leadership are presented. A more complex model is applied to give improved performance in a more realistic "data-poor" test condition. Such features can be important components of an overall automatic threat detection system such as the one presented in [2]. In such a system, automatic identification of individual roles and activities from basic features can help infer the intent of groups and individuals through higher-level pattern recognition and social network analysis.

In addition to predicting attributes of individuals, we use the relational model to predict the outcome of an event, in this case the fate of a hostage in a kidnapping event. Given a particular hostage-taking event, the system will be able to predict the probability that the hostage will be released or killed based on known properties of the event. Features in this model might include ransom demands and payment, regions and countries of the event, hostage nationality, and groups or individuals involved along with their past activities. Each of these features bears on the likelihood that a successful hostage release can be negotiated. The aggregation of relational features, such as the percentage of hostages released by similar groups in the past, can be used to improve performance. Aggregation...

∗This work was sponsored by the Department of Defense under Air Force Contract FA8721-05-C-0002. Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the United States Government.
†MIT Lincoln Laboratory, Information Systems Technology Group
‡University of Massachusetts Amherst, Knowledge Discovery Laboratory. A. Fast now at Elder Research Inc.

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
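The relational aggregation mentioned in the background text (the share of hostages released in a group's earlier events) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Event` record, the field names, the sample data, and the `past_release_rate` helper are all hypothetical, and "similar groups" is simplified here to "the same group".

```python
from typing import NamedTuple, Optional


class Event(NamedTuple):
    """Hypothetical kidnapping record; the field names are assumptions."""
    event_id: str
    group: str
    year: int
    outcome: str  # "released" or "killed"


# Made-up sample data for illustration only.
EVENTS = [
    Event("e1", "group_a", 2003, "released"),
    Event("e2", "group_a", 2004, "killed"),
    Event("e3", "group_a", 2005, "released"),
    Event("e4", "group_b", 2004, "killed"),
    Event("e5", "group_b", 2006, "killed"),
    Event("e6", "group_a", 2006, "released"),
]


def past_release_rate(events: list, target: Event) -> Optional[float]:
    """Fraction of the same group's strictly earlier events that ended in release.

    Returns None when the group has no earlier events, so that a model can
    treat "no history" differently from "never released".
    """
    prior = [e for e in events if e.group == target.group and e.year < target.year]
    if not prior:
        return None
    released = sum(1 for e in prior if e.outcome == "released")
    return released / len(prior)


# One aggregated feature value per event, using only information from the past.
rates = {e.event_id: past_release_rate(EVENTS, e) for e in EVENTS}

for event_id, rate in rates.items():
    print(event_id, rate)
```

Restricting the aggregate to strictly earlier events keeps the feature free of information from the outcome being predicted, which matters when a model is trained on historical events and evaluated on later ones.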

Similar resources

Relational Databases Query Optimization using Hybrid Evolutionary Algorithm

Optimizing database queries is a hard research problem. Exhaustive search techniques such as dynamic programming are suitable for queries with a few relations, but as the number of relations in a query grows, they demand too much memory and processing, so randomized and evolutionary methods must be used instead. The use of evolutionary methods, beca...


Metadata Enrichment for Automatic Data Entry Based on Relational Data Models

Automatically generating data entry forms from relational data models is a well-known idea that has drawn growing attention with the popularity of agile software development methods and the advancement of programming tools. One of the requirements of the automation methods, whether in commercial products or the relevant research projects, ...


Apply Uncertainty in Document-Oriented Database (MongoDB) Using F-XML

As we move into the big data world, where unstructured data grows at high velocity, data stores are needed that can hold this volume of data. Relational databases, the traditional choice, are not suited to handling such large amounts of data, so a move to non-relational data stores is needed. In the current study, we have proposed an extension of the Mongo...


View Learning for Statistical Relational Learning: With an Application to Mammography

Statistical relational learning (SRL) constructs probabilistic models from relational databases. A key capability of SRL is the learning of arcs (in the Bayes net sense) connecting entries in different rows of a relational table, or in different tables. Nevertheless, SRL approaches currently are constrained to use the existing database schema. For many database applications, users find it profi...


Publication date: 2010